Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream: the position of _read() is wrong #38292

Closed
wants to merge 1 commit into from

Conversation

helloyou2012
Copy link
Contributor

Fixes: #33940

@nodejs-github-bot nodejs-github-bot added fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. labels Apr 19, 2021
@VoltrexKeyva
Copy link
Member

Hi @helloyou2012, could you remove the merge commit from your branch? Merge commits tend to break the tooling, you can use git rebase instead.

As a best practice, once you have committed your changes, it is a good idea
to use `git rebase` (not `git merge`) to synchronize your work with the main
repository.

@benjamingr
Copy link
Member

@nodejs/streams

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is wrong? Can you please expand?

Can you please add a unit test?

Copy link
Member

@ronag ronag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs test and also a little unsure what exactly we are solving here. If I would guess:

  • Currently this.pos could actually be larger than the file size, which is a little weird.
  • Currently this.pos is what we have requested, but necessarily what we have read.

@helloyou2012
Copy link
Contributor Author

@mcollina see this issue: #33940, this

this._read(state.highWaterMark);
will call _read(n) multiple times with n= highWaterMark, but the bytesRead may be not equal to n, then the data will loss. Example:

// file data is: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
// pos = 0;
_read(10)
// if bytesRead = 3 then buf = [0, 1, 2]
// pos = 10
_read(10)
// if bytesRead = 10 then buf = [10, 11, 12], so [3, 4, 5, 6, 7, 8, 9] is loss.

@Ayase-252
Copy link
Member

Ayase-252 commented Apr 19, 2021

The problem arised because current implementation assumed next position to start reading is this.pos + n

if (this.pos !== undefined) {
this.pos += n;
}

But in reality, it does not necessarily read n bytes from file. In the callback of file handle read, it is possible that bytesRead does not equal n (less than n in my exprimentation)

.read(this.fd, buf, 0, n, this.pos, (er, bytesRead, buf) => {

As a result, some bytes in file may be skipped. The "assumed" reading is longer than the "actual". pos moves too far.

About the test, the issue occurred because read in the above line reads less bytes bytesRead than requested n. How could we reproduce (or mock) such condition in test?

Copy link
Member

@ronag ronag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change could make sense. But we need a test.

@benjamingr
Copy link
Member

About the test, the issue occurred because read in the above line reads less bytes bytesRead than requested n. How could we reproduce (or mock) such condition in test?

Good question. I'm not sure I don't think I ever wrote such a test in Node -

A good way to try is to take the test case from #33940 and to try to make a smaller isolated case from it (shorter interval and smaller watermark)

@nodejs/fs

@helloyou2012
Copy link
Contributor Author

I added test case. @ronag @benjamingr

@nodejs-github-bot
Copy link
Collaborator

highWaterMark: hwm,
start: cur
});
stream.on('data', common.mustCallAtLeast((chunk) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this is a mustCAllAtLeast and not a mustCall?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mustCall is exact times, but this place will call at least 1 time not exact 1 time.

Copy link
Member

@benjamingr benjamingr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if the test passes after this and fails in master

@benjamingr
Copy link
Member

Also thank you for the meaningful contribution and working with us 🙏

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, hopefully the setInterval test would not fail in CI.

Copy link
Member

@ronag ronag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Was wondering why this hasn't landed yet. Turns out it's my fault.

@ronag ronag added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. and removed needs-ci PRs that need a full CI run. labels Apr 25, 2021
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@ronag
Copy link
Member

ronag commented Apr 26, 2021

I think there is a problem on arm with the new test. @helloyou2012

@helloyou2012
Copy link
Contributor Author

I think there is a problem on arm with the new test. @helloyou2012

Yes, the test timed out. I think it is better to set a time after which to exit safely. I will fix later.

@jasnell
Copy link
Member

jasnell commented Apr 28, 2021

Removing the author ready label as there's still work to be done here.

@jasnell jasnell removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 28, 2021
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@ronag
Copy link
Member

ronag commented May 2, 2021

@helloyou2012 this still seems to have problems passing. Could you try and look at the CI failures? I started another CI just in case.

@nodejs-github-bot
Copy link
Collaborator

@helloyou2012
Copy link
Contributor Author

CI: https://ci.nodejs.org/job/node-test-pull-request/37873/

It failed cause of inspector-cli/test-inspector-cli-pid timed out, not of this.

@helloyou2012
Copy link
Contributor Author

CI: https://ci.nodejs.org/job/node-test-pull-request/37875/

This CI all passed.

@ronag ronag added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label May 2, 2021
@mcollina
Copy link
Member

mcollina commented May 3, 2021

Landed in d826f6b

@mcollina mcollina closed this May 3, 2021
mcollina pushed a commit that referenced this pull request May 3, 2021
Fixes: #33940

PR-URL: #38292
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Robert Nagy <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
targos pushed a commit that referenced this pull request May 3, 2021
Fixes: #33940

PR-URL: #38292
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Robert Nagy <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
@targos targos mentioned this pull request May 3, 2021
targos pushed a commit that referenced this pull request May 30, 2021
Fixes: #33940

PR-URL: #38292
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Robert Nagy <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
targos pushed a commit that referenced this pull request Jun 5, 2021
Fixes: #33940

PR-URL: #38292
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Robert Nagy <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
targos pushed a commit that referenced this pull request Jun 5, 2021
Fixes: #33940

PR-URL: #38292
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Robert Nagy <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
targos pushed a commit that referenced this pull request Jun 11, 2021
Fixes: #33940

PR-URL: #38292
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Robert Nagy <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. fs Issues and PRs related to the fs subsystem / file system.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

readStream with highWaterMark causes 'data' event to miss data
9 participants